Multiple‐cumulative probabilities used to cluster and visualize transcriptomes

Authors

  • Xingang Jia
  • Yisu Liu
  • Qiuhong Han
  • Zuhong Lu
Abstract

Clustering and visualizing gene expression data plays a central role in obtaining biological knowledge. Here, we used Pearson's correlation coefficient of the multiple-cumulative probabilities (PCC-MCP) of genes to define the similarity of gene expression behaviors. To address the high dimensionality of MCPs, we grouped genes with icc-cluster, a clustering algorithm that obtains its solutions by iterating clustering centers, using PCC-MCP as the similarity measure. We then applied t-distributed stochastic neighbor embedding (t-SNE) to KC-data to generate optimal maps for clusters of MCP (t-SNE-MCP-O maps). In the analysis of several transcriptome data sets, icc-cluster with PCC-MCP showed clear advantages over commonly used clustering methods. t-SNE-MCP-O also projected clusters of PCC-MCP with clear boundaries, which made the relationships between clusters easy to visualize and understand.
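The abstract does not define MCP or spell out the icc-cluster update rule, so the following is only a minimal sketch of the general idea: score genes by the Pearson correlation of their cumulative-probability profiles, then iterate cluster centers until the assignment stops changing. The rank-based `empirical_cumulative_probs` is an assumed stand-in for the paper's MCP, and `iterate_centers` is a generic k-means-style loop, not the authors' algorithm; all function names are hypothetical.

```python
import numpy as np

def empirical_cumulative_probs(expr):
    """ASSUMPTION: stand-in for MCP. For each sample (column), return the
    rank-based cumulative probability of every gene (row) in that sample."""
    n_genes = expr.shape[0]
    ranks = expr.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n_genes per column
    return ranks / n_genes

def pearson_to_centers(profiles, centers):
    """Pearson correlation between every profile row and every center row."""
    p = profiles - profiles.mean(axis=1, keepdims=True)
    c = centers - centers.mean(axis=1, keepdims=True)
    # Guard against constant rows, which would otherwise divide by zero.
    p = p / np.maximum(np.linalg.norm(p, axis=1, keepdims=True), 1e-12)
    c = c / np.maximum(np.linalg.norm(c, axis=1, keepdims=True), 1e-12)
    return p @ c.T

def iterate_centers(profiles, init_idx, n_iter=50):
    """Generic center iteration (NOT the paper's icc-cluster): assign each
    gene to its most correlated center, then recompute centers as means."""
    profiles = np.asarray(profiles, dtype=float)
    centers = profiles[list(init_idx)].copy()
    labels = np.full(len(profiles), -1)
    for _ in range(n_iter):
        new_labels = pearson_to_centers(profiles, centers).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(len(centers)):
            members = profiles[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

if __name__ == "__main__":
    # Two rising and two falling toy profiles; rows 0 and 2 seed the centers.
    toy = np.array([[0.1, 0.3, 0.6, 0.9],
                    [0.2, 0.4, 0.5, 0.8],
                    [0.9, 0.6, 0.3, 0.1],
                    [0.8, 0.5, 0.4, 0.2]])
    print(iterate_centers(toy, init_idx=[0, 2]))
```

Using correlation rather than Euclidean distance groups genes by the shape of their profile, not its magnitude, which is the property the abstract attributes to PCC-MCP. The labels from such a procedure could then be used to color a two-dimensional embedding of the same profiles.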

Similar resources

Cluster-Based Cumulative Ensembles

In this paper, we propose a cluster-based cumulative representation for cluster ensembles. Cluster labels are mapped to incrementally accumulated clusters, and a matching criterion based on maximum similarity is used. The ensemble method is investigated with bootstrap re-sampling, where the k-means algorithm is used to generate high granularity clusterings. For combining, group average hierarch...

Explaining Heterogeneity in Risk Preferences Using a Finite Mixture Model

This paper studies the effect of the space (distance) between lotteries' outcomes on risk-taking behavior and the shape of estimated utility and probability weighting functions. Previously investigated experimental data shows a significant space effect in the gain domain. As compared to low spaced lotteries, high spaced lotteries are associated with higher risk aversion for high probabilities o...

SurvClusVis: Exploring Survival Datasets Using Dimensionality Reduction

Here we present SurvClusVis, an application based on R shiny used to visualize survival data based on tSNE. We first find an appropriate low dimensional representation of the original data using tSNE, which is then clustered using K-means. Cluster membership per observation from the original data is used to construct survival probabilities by cluster, and cluster memberships can be further explored ...

Optimizing the Calculation of Hazard Zone Extents in Hazardous Area Classification with a Risk-Based Approach

Introduction: Leakage from process equipment and the release of flammable fluids into the surrounding atmosphere may create a flammable gas cloud. The coincidence of a flammable gas cloud with an ignition source could cause a flash fire or vapor cloud explosion, resulting in injury and fatality. The concept of reduction of confluence of flammable gas cloud and potential sources of ignition is known as hazardous a...

Cumulative Live-Birth Rates by Maternal Age after One or Multiple In Vitro Fertilization Cycles: An Institutional Experience

Background: The aim of this retrospective study is to investigate the cumulative live birth rate (CLBR) following one or more completed in vitro fertilization (IVF) cycles (up to 6 cycles) stratified by maternal age and type of infertility. Materials and Methods: In this retrospective study, five hundred forty-seven women who received 736 fresh ovarian stimulation/embryo transfer cycles between...

Volume 7

Publication date: 2017